Handling Translation Divergences in Generation-Heavy Hybrid Machine Translation
نویسنده
چکیده
This paper describes a novel approach for handling translation divergences in a Generation-Heavy Hybrid Machine Translation (GHMT) system. The approach depends on the existence of rich target language resources such as word lexical semantics, including information about categorial variations and subcate-gorization frames. These resources are used to generate multiple structural variations from a target-glossed lexico-syntactic representation of the source language sentence. The multiple structural variations account for diierent translation divergences. The overgeneration of the approach is constrained by a target-language model using corpus-based statistics. The exploitation of target language resources (symbolic and statistical) to handle a problem usually reserved to Transfer and Interlingual MT is useful for translation from structurally divergent source languages with scarce linguistic resources. A preliminary evaluation on the application of this approach to Spanish-English MT proves this approach extremely promising. The approach however is not limited to MT as it can be extended to monolingual NLG applications such as summarization.
منابع مشابه
Handling Translation Divergences: Combining Statistical and Symbolic Techniques in Generation-Heavy Machine Translation
This paper describes a novel approach to handling translation divergences in a Generation-Heavy Hybrid Machine Translation (GHMT) system. The translation divergence problem is usually reserved for Transfer and Interlingual MT because it requires a large combination of complex lexical and structural mappings. A major requirement of these approaches is the accessibility of large amounts of explic...
متن کاملA Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
متن کاملHandling Structural Divergences and Recovering Dropped Arguments in a Korean/english Machine Translation System ?
This paper describes an approach for handling structural divergences and recovering dropped arguments in an implemented Korean to English machine translation system. The approach relies on canonical predicate-argument structures (or dependency structures), which provide a suitable pivot representation for the handling of structural divergences and the recovery of dropped arguments. It can also ...
متن کاملMatador: Spanish-English GHMT
This paper presents the online demo of Matador, a large-scale Spanish-English machine translation system implemented following the Generation-heavy Hybrid Machine Translation (GHMT) approach.
متن کاملThe Use of a Structural N-gram Language Model in Generation-Heavy Hybrid Machine Translation
This paper describes the use of a statistical structural N-gram model in the natural language generation component of a Spanish-English generationheavy hybrid machine translation system. A structural N-gram model captures the relationship between words in a dependency representation without taking into account the overall structure at the phrase level. The model is used together with other comp...
متن کامل